32 research outputs found
NetLSD: Hearing the Shape of a Graph
Comparison among graphs is ubiquitous in graph analytics. However, it is a
hard task in terms of the expressiveness of the employed similarity measure and
the efficiency of its computation. Ideally, graph comparison should be
invariant to the order of nodes and the sizes of compared graphs, adaptive to
the scale of graph patterns, and scalable. Unfortunately, these properties have
not been addressed together. Graph comparisons still rely on direct approaches,
graph kernels, or representation-based methods, which are all inefficient and
impractical for large graph collections.
In this paper, we propose the Network Laplacian Spectral Descriptor (NetLSD):
the first, to our knowledge, permutation- and size-invariant, scale-adaptive,
and efficiently computable graph representation method that allows for
straightforward comparisons of large graphs. NetLSD extracts a compact
signature that inherits the formal properties of the Laplacian spectrum,
specifically its heat or wave kernel; thus, it hears the shape of a graph. Our
evaluation on a variety of real-world graphs demonstrates that it outperforms
previous works in both expressiveness and efficiency.
Comment: KDD '18: The 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, August 19--23, 2018, London, United Kingdom
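The abstract describes NetLSD's signature as inheriting the properties of the Laplacian spectrum via its heat kernel. A minimal sketch of that idea, assuming the signature is the heat-trace h(t) = Σ_j exp(-t λ_j) over eigenvalues λ_j of the normalized Laplacian, sampled at log-spaced time scales (the scale-adaptive part); this is an illustration, not the authors' implementation:

```python
import numpy as np

def heat_trace_signature(adj, times):
    """Heat trace h(t) = sum_j exp(-t * lambda_j) over eigenvalues of the
    normalized graph Laplacian. The trace depends only on the spectrum,
    so it is invariant to node permutations."""
    deg = adj.sum(axis=1)
    with np.errstate(divide="ignore"):
        d_inv_sqrt = np.where(deg > 0, 1.0 / np.sqrt(deg), 0.0)
    # Normalized Laplacian: L = I - D^{-1/2} A D^{-1/2}
    lap = np.eye(len(adj)) - d_inv_sqrt[:, None] * adj * d_inv_sqrt[None, :]
    eigvals = np.linalg.eigvalsh(lap)
    return np.array([np.exp(-t * eigvals).sum() for t in times])

# Compare two small graphs by the L2 distance between their signatures,
# sampled at logarithmically spaced time scales.
times = np.logspace(-2, 2, 250)
tri = np.array([[0, 1, 1], [1, 0, 1], [1, 1, 0]], dtype=float)   # triangle
path = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)  # 3-node path
dist = np.linalg.norm(heat_trace_signature(tri, times)
                      - heat_trace_signature(path, times))
```

Small t probes local structure (the trace starts near the node count n, since h(0) = n), while large t reflects global connectivity, which is what makes a multi-scale sampling of h(t) a compact comparison signature.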
On the Robustness of Post-hoc GNN Explainers to Label Noise
Proposed as a solution to the inherent black-box limitations of graph neural
networks (GNNs), post-hoc GNN explainers aim to provide precise and insightful
explanations of the behaviours exhibited by trained GNNs. Despite their recent
notable advancements in academic and industrial contexts, the robustness of
post-hoc GNN explainers remains unexplored when confronted with label noise. To
bridge this gap, we conduct a systematic empirical investigation to evaluate
the efficacy of diverse post-hoc GNN explainers under varying degrees of label
noise. Our results reveal several key insights: Firstly, post-hoc GNN
explainers are susceptible to label perturbations. Secondly, even minor levels
of label noise, inconsequential to GNN performance, harm the quality of
generated explanations substantially. Lastly, we discuss how explanation
effectiveness progressively recovers as noise levels escalate.
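The study evaluates explainers under varying degrees of label noise. A minimal sketch of the standard perturbation protocol, assuming uniform (symmetric) label noise; the paper's exact noise model may differ:

```python
import numpy as np

def flip_labels(labels, noise_rate, num_classes, seed=0):
    """With probability `noise_rate`, replace each label by a different
    class drawn uniformly at random (symmetric label noise)."""
    rng = np.random.default_rng(seed)
    noisy = labels.copy()
    flip = rng.random(len(labels)) < noise_rate
    for i in np.where(flip)[0]:
        # Choose uniformly among the other classes.
        choices = [c for c in range(num_classes) if c != noisy[i]]
        noisy[i] = rng.choice(choices)
    return noisy
```

One would then retrain the GNN on the noisy labels at each rate and measure explanation quality, e.g. fidelity of the explainer's output against the clean-model explanation.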
Knowledge-augmented Graph Machine Learning for Drug Discovery: A Survey from Precision to Interpretability
The integration of Artificial Intelligence (AI) into the field of drug
discovery has been a growing area of interdisciplinary scientific research.
However, conventional AI models are heavily limited in handling complex
biomedical structures (such as 2D or 3D protein and molecule structures) and
providing interpretations for outputs, which hinders their practical
application. As of late, Graph Machine Learning (GML) has gained considerable
attention for its exceptional ability to model graph-structured biomedical data
and investigate their properties and functional relationships. Despite
extensive efforts, GML methods still suffer from several deficiencies, such as
the limited ability to handle supervision sparsity and provide interpretability
in learning and inference processes, and their ineffectiveness in utilising
relevant domain knowledge. In response, recent studies have proposed
integrating external biomedical knowledge into the GML pipeline to realise more
precise and interpretable drug discovery with limited training instances.
However, a systematic definition for this burgeoning research direction is yet
to be established. This survey presents a comprehensive overview of
long-standing drug discovery principles, provides the foundational concepts and
cutting-edge techniques for graph-structured data and knowledge databases, and
formally summarises Knowledge-augmented Graph Machine Learning (KaGML) for drug
discovery. A thorough review of related KaGML works, collected following a
carefully designed search methodology, is organised into four categories
under a newly defined taxonomy. To facilitate research in this rapidly
emerging field, we also share collected practical resources that are valuable
for intelligent drug discovery and provide an in-depth discussion of the
potential avenues for future advancements.
SOFOS: Demonstrating the Challenges of Materialized View Selection on Knowledge Graphs
Analytical queries over RDF data are becoming prominent as a result of the
proliferation of knowledge graphs. Yet, RDF databases are not optimized to
perform such queries efficiently, leading to long processing times. A well
known technique to improve the performance of analytical queries is to exploit
materialized views. Although popular in relational databases, view
materialization for RDF and SPARQL has not yet transitioned into practice, due
to the non-trivial application to the RDF graph model. Motivated by a lack of
understanding of the impact of view materialization alternatives for RDF data,
we demonstrate SOFOS, a system that implements and compares several cost models
for view materialization. SOFOS is, to the best of our knowledge, the first
attempt to adapt cost models, initially studied in relational data, to the
generic RDF setting, and to propose new ones, analyzing their pitfalls and
merits. SOFOS takes an RDF dataset and an analytical query for some facet in
the data, and compares and evaluates alternative cost models, displaying
statistics and insights about time, memory consumption, and query
characteristics.
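SOFOS compares cost models for deciding which views to materialize. As a rough illustration of what such a model weighs, here is a generic linear benefit estimate in the classic relational style; the function name, parameters, and linear form are illustrative assumptions, not SOFOS's actual models:

```python
def view_benefit(query_freq, base_cost, view_cost, maintenance_cost):
    """Estimated net benefit of materializing a candidate view:
    time saved across queries minus the cost of keeping the view fresh.
    All inputs are in the same (arbitrary) cost units."""
    return query_freq * (base_cost - view_cost) - maintenance_cost

# Rank hypothetical candidate views for an analytical workload.
candidates = {
    "per_author_counts": view_benefit(100, 3.0, 1.0, 30.0),
    "per_year_sums": view_benefit(10, 5.0, 1.0, 60.0),
}
best = max(candidates, key=candidates.get)
```

Adapting such models to RDF is non-trivial precisely because "base_cost" and "maintenance_cost" behave differently under the triple-based graph model than under relational tables, which is the gap the demonstration explores.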
VERSE: Versatile Graph Embeddings from Similarity Measures
Embedding a web-scale information network into a low-dimensional vector space
facilitates tasks such as link prediction, classification, and visualization.
Past research has addressed the problem of extracting such embeddings by
adopting methods from words to graphs, without defining a clearly
comprehensible graph-related objective. Yet, as we show, the objectives used in
past works implicitly utilize similarity measures among graph nodes.
In this paper, we carry the similarity orientation of previous works to its
logical conclusion; we propose VERtex Similarity Embeddings (VERSE), a simple,
versatile, and memory-efficient method that derives graph embeddings explicitly
calibrated to preserve the distributions of a selected vertex-to-vertex
similarity measure. VERSE learns such embeddings by training a single-layer
neural network. While its default, scalable version does so via sampling
similarity information, we also develop a variant using the full information
per vertex. Our experimental study on standard benchmarks and real-world
datasets demonstrates that VERSE, instantiated with diverse similarity
measures, outperforms state-of-the-art methods in terms of precision and recall
in major data mining tasks and supersedes them in time and space efficiency,
while the scalable sampling-based variant achieves equally good results as the
non-scalable full variant.
Comment: In WWW 2018: The Web Conference. 10 pages, 5 figures
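The abstract describes the full-information variant as a single-layer model calibrated to preserve a chosen vertex-to-vertex similarity distribution. A minimal sketch in that spirit, assuming the objective is the KL divergence between each row of a given similarity matrix and the row-wise softmax of the embedding Gram matrix; hyperparameters and the gradient-descent loop are illustrative, not the authors' code:

```python
import numpy as np

def verse_full(sim, dim=16, lr=0.5, epochs=300, seed=0):
    """Learn embeddings W so that softmax(W @ W.T), row-wise, matches
    `sim` (each row a probability distribution over vertices), by
    gradient descent on the summed KL divergence."""
    rng = np.random.default_rng(seed)
    n = sim.shape[0]
    w = rng.normal(scale=0.1, size=(n, dim))
    for _ in range(epochs):
        logits = w @ w.T
        logits -= logits.max(axis=1, keepdims=True)  # numerical stability
        p = np.exp(logits)
        p /= p.sum(axis=1, keepdims=True)
        # d/dW of the row-wise softmax cross-entropy, with W on both sides
        # of the Gram matrix: (P - S) W + (P - S)^T W.
        grad = (p - sim) @ w + (p - sim).T @ w
        w -= lr * grad / n
    return w

# Example: row-normalized adjacency of a 4-node graph as the similarity.
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
sim = A / A.sum(axis=1, keepdims=True)
emb = verse_full(sim)
```

Swapping `sim` for Personalized PageRank or SimRank rows is what makes the framework "versatile"; the scalable variant replaces the full rows with sampled similarity information.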
Spectral Graph Complexity
We introduce a spectral notion of graph complexity derived from Weyl's
law. We experimentally demonstrate its correlation to how well the graph can be
embedded in a low-dimensional Euclidean space.
Comment: BigNet workshop at the Web Conference '201
How Faithful are Self-Explainable GNNs?
Self-explainable deep neural networks are a recent class of models that can
output ante-hoc local explanations that are faithful to the model's reasoning,
and as such represent a step forward toward filling the gap between
expressiveness and interpretability. Self-explainable graph neural networks
(GNNs) aim at achieving the same in the context of graph data. This begs the
question: do these models fulfill their implicit guarantees in terms of
faithfulness? In this extended abstract, we analyze the faithfulness of several
self-explainable GNNs using different measures of faithfulness, identify
several limitations -- both in the models themselves and in the evaluation
metrics -- and outline possible ways forward.